Exploratory Data Analysis for Forecasting and Anomaly Detection Models

Christopher Yahir Moreno Moreno

In [1]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
import seaborn as sns 
from datetime import datetime

plt.style.use('ggplot')
pd.set_option('display.max_columns', None)
In [2]:
from utils import describe_dataset

# 1.1 Load and describe the data
electrical_data = pd.read_csv('electrical_data.csv')
environment_data = pd.read_csv('environment_data.csv')

# Read the irradiance file once, parsing the timestamp column at load time
irradiance_data = pd.read_csv('irradiance_data.csv', parse_dates=['measured_on'])

# Coerce every non-timestamp column to numeric; unparseable entries become NaN
for col in irradiance_data.columns:
    if col != 'measured_on':
        irradiance_data[col] = pd.to_numeric(irradiance_data[col], errors='coerce')

describe_dataset(electrical_data, 'Electrical Data')
describe_dataset(environment_data, 'Environment Data')
describe_dataset(irradiance_data, 'Irradiance Data')
==================================================
Dataset description: Electrical Data
==================================================
Number of rows: 632952
Number of columns: 120

First 5 rows:
measured_on inv_01_dc_current_inv_149579 inv_01_dc_voltage_inv_149580 inv_01_ac_current_inv_149581 inv_01_ac_voltage_inv_149582 inv_01_ac_power_inv_149583 inv_02_dc_current_inv_149584 inv_02_dc_voltage_inv_149585 inv_02_ac_current_inv_149586 inv_02_ac_voltage_inv_149587 inv_02_ac_power_inv_149588 inv_03_dc_current_inv_149589 inv_03_dc_voltage_inv_149590 inv_03_ac_current_inv_149591 inv_03_ac_voltage_inv_149592 inv_03_ac_power_inv_149593 inv_04_dc_current_inv_149594 inv_04_dc_voltage_inv_149595 inv_04_ac_current_inv_149596 inv_04_ac_voltage_inv_149597 inv_04_ac_power_inv_149598 inv_05_dc_current_inv_149599 inv_05_ac_current_inv_149601 inv_05_ac_voltage_inv_149602 inv_05_ac_power_inv_149603 inv_06_dc_current_inv_149604 inv_06_dc_voltage_inv_149605 inv_06_ac_current_inv_149606 inv_06_ac_voltage_inv_149607 inv_06_ac_power_inv_149608 inv_07_dc_current_inv_149609 inv_07_dc_voltage_inv_149610 inv_07_ac_current_inv_149611 inv_07_ac_voltage_inv_149612 inv_07_ac_power_inv_149613 inv_08_dc_current_inv_149614 inv_08_dc_voltage_inv_149615 inv_08_ac_current_inv_149616 inv_08_ac_voltage_inv_149617 inv_08_ac_power_inv_149618 inv_09_dc_current_inv_149619 inv_09_dc_voltage_inv_149620 inv_09_ac_current_inv_149621 inv_09_ac_voltage_inv_149622 inv_09_ac_power_inv_149623 inv_10_dc_current_inv_149624 inv_10_dc_voltage_inv_149625 inv_10_ac_current_inv_149626 inv_10_ac_voltage_inv_149627 inv_10_ac_power_inv_149628 inv_11_dc_current_inv_149629 inv_11_dc_voltage_inv_149630 inv_11_ac_current_inv_149631 inv_11_ac_voltage_inv_149632 inv_11_ac_power_inv_149633 inv_12_dc_current_inv_149634 inv_12_dc_voltage_inv_149635 inv_12_ac_current_inv_149636 inv_12_ac_voltage_inv_149637 inv_12_ac_power_inv_149638 inv_13_dc_current_inv_149639 inv_13_dc_voltage_inv_149640 inv_13_ac_current_inv_149641 inv_13_ac_voltage_inv_149642 inv_13_ac_power_inv_149643 inv_14_dc_current_inv_149644 inv_14_dc_voltage_inv_149645 inv_14_ac_current_inv_149646 inv_14_ac_voltage_inv_149647 inv_14_ac_power_inv_149648 
inv_15_dc_current_inv_149649 inv_15_dc_voltage_inv_149650 inv_15_ac_current_inv_149651 inv_15_ac_voltage_inv_149652 inv_15_ac_power_iinv_149653 inv_16_dc_current_inv_149654 inv_16_dc_voltage_inv_149655 inv_16_ac_current_inv_149656 inv_16_ac_voltage_inv_149657 inv_16_ac_power_inv_149658 inv_17_dc_current_inv_149659 inv_17_dc_voltage_inv_149660 inv_17_ac_current_inv_149661 inv_17_ac_voltage_inv_149662 inv_17_ac_power_inv_149663 inv_18_dc_current_inv_149664 inv_18_dc_voltage_inv_149665 inv_18_ac_current_inv_149666 inv_18_ac_voltage_inv_149667 inv_18_ac_power_inv_149668 inv_19_dc_current_inv_149669 inv_19_dc_voltage_inv_149670 inv_19_ac_current_inv_149671 inv_19_ac_voltage_inv_149672 inv_19_ac_power_inv_149673 inv_20_dc_current_inv_149674 inv_20_dc_voltage_inv_149675 inv_20_ac_current_inv_149676 inv_20_ac_voltage_inv_149677 inv_20_ac_power_inv_149678 inv_21_dc_current_inv_149679 inv_21_dc_voltage_inv_149680 inv_21_ac_current_inv_149681 inv_21_ac_voltage_inv_149682 inv_21_ac_power_inv_149683 inv_22_dc_current_inv_149684 inv_22_dc_voltage_inv_149685 inv_22_ac_current_inv_149686 inv_22_ac_voltage_inv_149687 inv_22_ac_power_inv_149688 inv_23_dc_current_inv_149689 inv_23_dc_voltage_inv_149690 inv_23_ac_current_inv_149691 inv_23_ac_voltage_inv_149692 inv_23_ac_power_inv_149693 inv_24_dc_current_inv_149694 inv_24_dc_voltage_inv_149695 inv_24_ac_current_inv_149696 inv_24_ac_voltage_inv_149697 inv_24_ac_power_inv_149698
0 2017-11-01 00:00:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 2017-11-01 00:05:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 2017-11-01 00:10:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 2017-11-01 00:15:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 2017-11-01 00:20:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Summary of data types and missing values:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 632952 entries, 0 to 632951
Columns: 120 entries, measured_on to inv_24_ac_power_inv_149698
dtypes: float64(119), object(1)
memory usage: 579.5+ MB
None
Descriptive statistics:
measured_on inv_01_dc_current_inv_149579 inv_01_dc_voltage_inv_149580 inv_01_ac_current_inv_149581 inv_01_ac_voltage_inv_149582 inv_01_ac_power_inv_149583 inv_02_dc_current_inv_149584 inv_02_dc_voltage_inv_149585 inv_02_ac_current_inv_149586 inv_02_ac_voltage_inv_149587 inv_02_ac_power_inv_149588 inv_03_dc_current_inv_149589 inv_03_dc_voltage_inv_149590 inv_03_ac_current_inv_149591 inv_03_ac_voltage_inv_149592 inv_03_ac_power_inv_149593 inv_04_dc_current_inv_149594 inv_04_dc_voltage_inv_149595 inv_04_ac_current_inv_149596 inv_04_ac_voltage_inv_149597 inv_04_ac_power_inv_149598 inv_05_dc_current_inv_149599 inv_05_ac_current_inv_149601 inv_05_ac_voltage_inv_149602 inv_05_ac_power_inv_149603 inv_06_dc_current_inv_149604 inv_06_dc_voltage_inv_149605 inv_06_ac_current_inv_149606 inv_06_ac_voltage_inv_149607 inv_06_ac_power_inv_149608 inv_07_dc_current_inv_149609 inv_07_dc_voltage_inv_149610 inv_07_ac_current_inv_149611 inv_07_ac_voltage_inv_149612 inv_07_ac_power_inv_149613 inv_08_dc_current_inv_149614 inv_08_dc_voltage_inv_149615 inv_08_ac_current_inv_149616 inv_08_ac_voltage_inv_149617 inv_08_ac_power_inv_149618 inv_09_dc_current_inv_149619 inv_09_dc_voltage_inv_149620 inv_09_ac_current_inv_149621 inv_09_ac_voltage_inv_149622 inv_09_ac_power_inv_149623 inv_10_dc_current_inv_149624 inv_10_dc_voltage_inv_149625 inv_10_ac_current_inv_149626 inv_10_ac_voltage_inv_149627 inv_10_ac_power_inv_149628 inv_11_dc_current_inv_149629 inv_11_dc_voltage_inv_149630 inv_11_ac_current_inv_149631 inv_11_ac_voltage_inv_149632 inv_11_ac_power_inv_149633 inv_12_dc_current_inv_149634 inv_12_dc_voltage_inv_149635 inv_12_ac_current_inv_149636 inv_12_ac_voltage_inv_149637 inv_12_ac_power_inv_149638 inv_13_dc_current_inv_149639 inv_13_dc_voltage_inv_149640 inv_13_ac_current_inv_149641 inv_13_ac_voltage_inv_149642 inv_13_ac_power_inv_149643 inv_14_dc_current_inv_149644 inv_14_dc_voltage_inv_149645 inv_14_ac_current_inv_149646 inv_14_ac_voltage_inv_149647 inv_14_ac_power_inv_149648 
inv_15_dc_current_inv_149649 inv_15_dc_voltage_inv_149650 inv_15_ac_current_inv_149651 inv_15_ac_voltage_inv_149652 inv_15_ac_power_iinv_149653 inv_16_dc_current_inv_149654 inv_16_dc_voltage_inv_149655 inv_16_ac_current_inv_149656 inv_16_ac_voltage_inv_149657 inv_16_ac_power_inv_149658 inv_17_dc_current_inv_149659 inv_17_dc_voltage_inv_149660 inv_17_ac_current_inv_149661 inv_17_ac_voltage_inv_149662 inv_17_ac_power_inv_149663 inv_18_dc_current_inv_149664 inv_18_dc_voltage_inv_149665 inv_18_ac_current_inv_149666 inv_18_ac_voltage_inv_149667 inv_18_ac_power_inv_149668 inv_19_dc_current_inv_149669 inv_19_dc_voltage_inv_149670 inv_19_ac_current_inv_149671 inv_19_ac_voltage_inv_149672 inv_19_ac_power_inv_149673 inv_20_dc_current_inv_149674 inv_20_dc_voltage_inv_149675 inv_20_ac_current_inv_149676 inv_20_ac_voltage_inv_149677 inv_20_ac_power_inv_149678 inv_21_dc_current_inv_149679 inv_21_dc_voltage_inv_149680 inv_21_ac_current_inv_149681 inv_21_ac_voltage_inv_149682 inv_21_ac_power_inv_149683 inv_22_dc_current_inv_149684 inv_22_dc_voltage_inv_149685 inv_22_ac_current_inv_149686 inv_22_ac_voltage_inv_149687 inv_22_ac_power_inv_149688 inv_23_dc_current_inv_149689 inv_23_dc_voltage_inv_149690 inv_23_ac_current_inv_149691 inv_23_ac_voltage_inv_149692 inv_23_ac_power_inv_149693 inv_24_dc_current_inv_149694 inv_24_dc_voltage_inv_149695 inv_24_ac_current_inv_149696 inv_24_ac_voltage_inv_149697 inv_24_ac_power_inv_149698
count 632952 632586.000000 632586.000000 632586.000000 632586.000000 632586.000000 632586.000000 632586.000000 632586.000000 632586.000000 632586.000000 632952.000000 632952.000000 632952.000000 632952.000000 632952.000000 631224.000000 631224.000000 6.312240e+05 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 6.312240e+05 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 6.312240e+05 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.00000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.00000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000 631224.000000
unique 632952 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
top 2017-11-01 00:00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
freq 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
mean NaN 9.856498 319.205209 7.454513 136.362110 6.363338 9.228654 316.131588 6.988613 135.249976 5.982282 9.988669 330.187014 7.551154 145.218504 6.500057 9.336884 304.037668 1.481321e+05 132.865773 6.130343 9.840238 7.499344 138.567962 6.415813 7.224951e+04 301.613943 6.793178 130.691349 5.834427 8.898383 2.186520e+06 6.830158 127.442930 5.879789 10.144463 328.710893 7.720310 141.890221 6.564618 10.387627 336.468661 7.983449 146.715587 6.821695 10.280984 328.001489 7.812005 142.448472 6.64902 10.623234 333.753824 8.177335 145.354794 6.990957 10.369173 335.621148 7.954767 146.398971 6.829824 10.554597 338.963935 8.043897 145.087225 6.903008 10.157293 329.584876 7.830556 142.653776 6.681862 10.266145 329.817723 7.831213 144.259179 6.72965 10.005688 329.340681 7.638052 142.126222 6.537007 9.878494 324.686964 7.553521 139.201330 6.438483 10.250447 334.395949 7.780927 146.123299 6.717585 9.866144 333.206971 7.470421 147.162844 6.424750 10.706515 339.977851 8.209786 146.701772 7.049536 9.804948 334.120535 7.511014 143.407506 6.410051 10.397607 333.731323 7.970044 147.475516 6.891622 10.028295 332.493139 7.700779 142.157008 6.577954 9.691090 327.085068 7.449268 139.982626 6.382338
std NaN 15.343311 337.317409 11.367359 143.161743 9.933949 14.928043 336.273826 10.973514 143.443901 9.607771 15.339447 334.748411 11.270075 144.960139 9.909333 15.152162 333.683578 1.176847e+08 143.867354 9.856842 15.321611 11.375410 142.926061 9.924226 5.739487e+07 334.856753 10.972734 143.742364 9.589361 14.891727 1.736950e+09 11.158162 143.129792 9.771875 15.703293 337.010445 11.590759 143.204191 10.089930 15.708819 336.542095 11.730733 144.598272 10.228337 15.790630 332.460844 11.652194 142.733681 10.12290 15.865384 335.553021 11.886606 143.180846 10.345037 15.546061 335.799193 11.645162 144.286116 10.165567 15.817337 338.306413 11.808397 144.022326 10.308671 15.628730 336.143930 11.674743 143.691883 10.183399 15.578807 335.386514 11.607848 144.374176 10.13432 15.385801 335.939452 11.411198 143.705472 9.968686 15.541239 337.158979 11.521450 143.156845 10.045097 15.489888 335.898679 11.457513 144.840569 10.073827 15.067273 334.048667 11.053540 144.806703 9.726843 15.867650 337.684988 11.867573 144.030676 10.391237 15.101541 339.753089 11.178806 143.603348 9.806287 15.587829 335.732962 11.614257 145.023864 10.244947 15.605688 338.846880 11.595841 143.247243 10.158701 15.295462 338.176004 11.389543 143.089715 10.000710
min NaN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25% NaN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
50% NaN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.012000 292.489000 0.000000 267.740000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.059000 365.220000 0.000000 274.893000 0.000000 0.000000 290.267000 0.000000 9.492000 0.00000 0.000000 295.036000 0.000000 272.493000 0.000000 0.055000 360.026500 0.000000 274.743000 0.000000 0.137000 375.090500 0.000000 270.529000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 295.486000 0.000000 272.878000 0.000000 0.000000 296.772000 0.000000 275.343000 0.000000 0.085000 403.089000 0.000000 274.374000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 294.618000 0.000000 275.956000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
75% NaN 16.029000 675.261000 12.696000 286.213000 10.763000 13.966000 672.873000 11.175000 287.051000 9.538000 16.900000 675.077000 13.392000 289.901000 11.498000 13.982250 673.173000 1.132400e+01 288.043000 9.655000 16.287000 12.993000 285.753000 11.061250 1.268700e+01 672.072000 10.155000 288.095000 8.657250 12.233000 6.701010e+02 9.932000 287.208000 8.478000 17.018000 676.447000 13.509000 286.344000 11.482000 18.315250 679.437000 14.678000 289.136000 12.533000 17.596000 668.330000 13.966000 285.490000 11.90100 19.333000 680.255000 15.445000 286.514000 13.185000 18.623000 679.737000 14.838000 288.694000 12.717000 18.870000 681.574000 14.943000 288.199000 12.806000 17.475250 677.182000 14.070000 287.443000 12.005000 18.152000 677.533000 14.424000 288.668000 12.36500 17.189000 677.064000 13.739000 287.286000 11.722000 16.242000 675.562000 12.984000 286.265000 11.070000 18.120000 678.852000 14.346000 289.616000 12.341000 17.091000 675.144000 13.542000 289.603000 11.629000 19.472250 681.875000 15.524000 288.030000 13.286000 17.026250 680.793250 13.600000 287.064000 11.597000 18.652000 680.806000 14.845000 290.004000 12.773250 16.707000 680.205000 13.384000 286.370000 11.432250 15.544250 679.236000 12.518000 286.016000 10.718000
max NaN 52.348000 909.840000 36.363000 310.677000 30.096000 54.844000 1750.755000 36.103000 310.508000 30.088000 55.819000 909.740000 36.218000 310.544000 30.092000 54.468000 1487.769000 9.350000e+10 347.127000 103.710000 53.569000 36.400000 310.036000 30.085000 4.560000e+10 912.095000 36.171000 311.329000 30.086000 54.274000 1.380000e+12 35.929000 311.498000 105.341000 54.730000 913.648000 36.514000 310.508000 30.089000 54.277000 921.714000 36.264000 310.448000 30.089000 55.843000 915.635000 36.538000 310.303000 30.08800 54.468000 912.462000 36.492000 310.218000 30.089000 54.959000 914.967000 36.140000 311.112000 30.089000 54.212000 912.412000 36.224000 310.436000 30.088000 53.665000 916.954000 36.397000 309.455000 30.091000 53.516000 916.036000 35.926000 310.544000 30.08600 53.721000 914.466000 35.910000 310.448000 30.086000 54.437000 914.683000 36.483000 310.605000 30.088000 54.046000 914.917000 35.869000 310.931000 30.086000 53.717000 914.165000 35.944000 310.387000 30.096000 55.095000 914.533000 36.249000 312.112000 30.095000 54.409000 925.054000 36.168000 310.617000 30.082000 53.762000 913.731000 35.678000 311.558000 30.093000 54.575000 913.080000 35.848000 310.822000 30.106000 55.051000 912.795000 36.066000 310.605000 30.089000
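The electrical summary above contains a few columns whose maxima are on the order of 1e10 to 1e12, while the 75th percentiles of those same columns stay below roughly 700. Readings like that look corrupt rather than physical, and they inflate the mean and standard deviation. A minimal sketch of one way to flag them before modelling follows. It assumes a simple rule: a value is suspect when it exceeds a multiple of the column's 99th percentile. The helper name `flag_implausible`, the `factor` parameter, and the synthetic data are illustrative assumptions, not part of the notebook's `utils` module.

```python
import numpy as np
import pandas as pd


def flag_implausible(df: pd.DataFrame, factor: float = 10.0) -> pd.DataFrame:
    """Return a boolean mask marking values above factor * the column's 99th percentile.

    Hypothetical helper: both the rule and the default factor are assumptions.
    """
    numeric = df.select_dtypes(include="number")
    # Per-column upper limit derived from a high quantile, which a handful
    # of extreme readings barely moves.
    limits = numeric.quantile(0.99) * factor
    return numeric.gt(limits, axis="columns")


# Synthetic example: one corrupt reading in an otherwise bounded column.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "dc_voltage": rng.uniform(0, 900, size=1000),
    "ac_power": rng.uniform(0, 30, size=1000),
})
demo.loc[10, "dc_voltage"] = 1.38e12  # magnitude similar to the maxima seen above

mask = flag_implausible(demo)
cleaned = demo.mask(mask)  # suspect values become NaN
print(mask.sum())
```

Columns whose 99th percentile is zero would need a different limit (for example a rated value from the inverter datasheet), since this rule would then flag every positive reading.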
==================================================
Dataset description: Environment Data
==================================================
Number of rows: 206008
Number of columns: 4

First 5 rows:
          measured_on ambient_temperature_o_149575 wind_speed_o_149576  wind_direction_o_149577
0  2017-12-01 0:00:00                         38.8                 1.2                    156.0
1  2017-12-01 0:15:00                         38.8                 1.2                    156.0
2  2017-12-01 0:30:00                         38.8                 1.2                    156.0
3  2017-12-01 0:45:00                         38.8                 1.2                    156.0
4  2017-12-01 1:00:00                         37.0                 2.6                    247.0
Summary of data types and missing values:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 206008 entries, 0 to 206007
Data columns (total 4 columns):
 #   Column                        Non-Null Count   Dtype  
---  ------                        --------------   -----  
 0   measured_on                   206008 non-null  object 
 1   ambient_temperature_o_149575  205876 non-null  object 
 2   wind_speed_o_149576           205992 non-null  object 
 3   wind_direction_o_149577       206000 non-null  float64
dtypes: float64(1), object(3)
memory usage: 6.3+ MB
None
Descriptive statistics:
               measured_on ambient_temperature_o_149575 wind_speed_o_149576  wind_direction_o_149577
count               206008                       205876              205992            206000.000000
unique              206008                          836                 272                      NaN
top     2017-12-01 0:00:00                         61.6                 2.7                      NaN
freq                     1                          616                4888                      NaN
mean                   NaN                          NaN                 NaN               187.551757
std                    NaN                          NaN                 NaN                98.382701
min                    NaN                          NaN                 NaN                 0.000000
25%                    NaN                          NaN                 NaN               123.000000
50%                    NaN                          NaN                 NaN               162.000000
75%                    NaN                          NaN                 NaN               270.000000
max                    NaN                          NaN                 NaN               360.000000
==================================================
Dataset description: Irradiance Data
==================================================
Number of rows: 531019
Number of columns: 2

First 5 rows:
          measured_on  poa_irradiance_o_149574
0 2017-11-01 07:10:00                      0.0
1 2017-11-01 07:15:00                      0.0
2 2017-11-01 07:20:00                      0.0
3 2017-11-01 07:25:00                      0.0
4 2017-11-01 09:00:00                    267.5
Summary of data types and missing values:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 531019 entries, 0 to 531018
Data columns (total 2 columns):
 #   Column                   Non-Null Count   Dtype         
---  ------                   --------------   -----         
 0   measured_on              531019 non-null  datetime64[ns]
 1   poa_irradiance_o_149574  516584 non-null  float64       
dtypes: datetime64[ns](1), float64(1)
memory usage: 8.1 MB
None
Descriptive statistics:
                         measured_on  poa_irradiance_o_149574
count                         531019            516584.000000
mean   2020-10-22 23:51:20.311250688               255.862654
min              2017-11-01 07:10:00                 0.000000
25%              2019-05-03 11:52:30                 0.000000
50%              2020-09-15 08:35:00                27.600000
75%              2022-04-15 09:27:30               503.500000
max              2023-11-01 23:55:00              1400.000000
std                              NaN               341.748415
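In the Environment Data summary above, `ambient_temperature_o_149575` and `wind_speed_o_149576` are stored as `object`, so the statistics report no mean or quantiles for them. One possible remedy is the same `pd.to_numeric(..., errors="coerce")` step already applied to the irradiance data. The sketch below assumes the cause is a small number of non-numeric strings in those columns, which the notebook does not confirm. The helper name `coerce_numeric` and the sample values are illustrative.

```python
import pandas as pd


def coerce_numeric(df: pd.DataFrame, skip: tuple[str, ...] = ("measured_on",)) -> pd.DataFrame:
    """Return a copy with every column except those in `skip` converted to numeric.

    Hypothetical helper; entries that cannot be parsed become NaN.
    """
    out = df.copy()
    for col in out.columns:
        if col not in skip:
            out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


# Small made-up sample: "bad" stands in for whatever non-numeric text the file may contain.
sample = pd.DataFrame({
    "measured_on": ["2017-12-01 0:00:00", "2017-12-01 0:15:00", "2017-12-01 0:30:00"],
    "ambient_temperature_o_149575": ["38.8", "bad", "37.0"],
    "wind_speed_o_149576": ["1.2", "2.6", None],
})
clean = coerce_numeric(sample)
print(clean.dtypes)
print(clean.isna().sum())
```

Comparing `isna().sum()` before and after the conversion shows how many entries failed to parse, which is worth inspecting before treating them as ordinary missing values.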
In [3]:
from utils import analyze_structure 

# 1.2 Structure of the datasets
electrical_data = analyze_structure(electrical_data, "Electrical Data")
environment_data = analyze_structure(environment_data, "Environment Data")
irradiance_data = analyze_structure(irradiance_data, "Irradiance Data")
Structure analysis for: Electrical Data

Date/time columns identified: ['measured_on']

Time range:
Minimum date: 2017-11-01 00:00:00
Maximum date: 2023-11-07 23:55:00
Total duration: 2197 days 23:55:00

Numeric columns (119):
['inv_01_dc_current_inv_149579', 'inv_01_dc_voltage_inv_149580', 'inv_01_ac_current_inv_149581', 'inv_01_ac_voltage_inv_149582', 'inv_01_ac_power_inv_149583', 'inv_02_dc_current_inv_149584', 'inv_02_dc_voltage_inv_149585', 'inv_02_ac_current_inv_149586', 'inv_02_ac_voltage_inv_149587', 'inv_02_ac_power_inv_149588', 'inv_03_dc_current_inv_149589', 'inv_03_dc_voltage_inv_149590', 'inv_03_ac_current_inv_149591', 'inv_03_ac_voltage_inv_149592', 'inv_03_ac_power_inv_149593', 'inv_04_dc_current_inv_149594', 'inv_04_dc_voltage_inv_149595', 'inv_04_ac_current_inv_149596', 'inv_04_ac_voltage_inv_149597', 'inv_04_ac_power_inv_149598', 'inv_05_dc_current_inv_149599', 'inv_05_ac_current_inv_149601', 'inv_05_ac_voltage_inv_149602', 'inv_05_ac_power_inv_149603', 'inv_06_dc_current_inv_149604', 'inv_06_dc_voltage_inv_149605', 'inv_06_ac_current_inv_149606', 'inv_06_ac_voltage_inv_149607', 'inv_06_ac_power_inv_149608', 'inv_07_dc_current_inv_149609', 'inv_07_dc_voltage_inv_149610', 'inv_07_ac_current_inv_149611', 'inv_07_ac_voltage_inv_149612', 'inv_07_ac_power_inv_149613', 'inv_08_dc_current_inv_149614', 'inv_08_dc_voltage_inv_149615', 'inv_08_ac_current_inv_149616', 'inv_08_ac_voltage_inv_149617', 'inv_08_ac_power_inv_149618', 'inv_09_dc_current_inv_149619', 'inv_09_dc_voltage_inv_149620', 'inv_09_ac_current_inv_149621', 'inv_09_ac_voltage_inv_149622', 'inv_09_ac_power_inv_149623', 'inv_10_dc_current_inv_149624', 'inv_10_dc_voltage_inv_149625', 'inv_10_ac_current_inv_149626', 'inv_10_ac_voltage_inv_149627', 'inv_10_ac_power_inv_149628', 'inv_11_dc_current_inv_149629', 'inv_11_dc_voltage_inv_149630', 'inv_11_ac_current_inv_149631', 'inv_11_ac_voltage_inv_149632', 'inv_11_ac_power_inv_149633', 'inv_12_dc_current_inv_149634', 'inv_12_dc_voltage_inv_149635', 'inv_12_ac_current_inv_149636', 'inv_12_ac_voltage_inv_149637', 'inv_12_ac_power_inv_149638', 'inv_13_dc_current_inv_149639', 'inv_13_dc_voltage_inv_149640', 'inv_13_ac_current_inv_149641', 'inv_13_ac_voltage_inv_149642', 
'inv_13_ac_power_inv_149643', 'inv_14_dc_current_inv_149644', 'inv_14_dc_voltage_inv_149645', 'inv_14_ac_current_inv_149646', 'inv_14_ac_voltage_inv_149647', 'inv_14_ac_power_inv_149648', 'inv_15_dc_current_inv_149649', 'inv_15_dc_voltage_inv_149650', 'inv_15_ac_current_inv_149651', 'inv_15_ac_voltage_inv_149652', 'inv_15_ac_power_iinv_149653', 'inv_16_dc_current_inv_149654', 'inv_16_dc_voltage_inv_149655', 'inv_16_ac_current_inv_149656', 'inv_16_ac_voltage_inv_149657', 'inv_16_ac_power_inv_149658', 'inv_17_dc_current_inv_149659', 'inv_17_dc_voltage_inv_149660', 'inv_17_ac_current_inv_149661', 'inv_17_ac_voltage_inv_149662', 'inv_17_ac_power_inv_149663', 'inv_18_dc_current_inv_149664', 'inv_18_dc_voltage_inv_149665', 'inv_18_ac_current_inv_149666', 'inv_18_ac_voltage_inv_149667', 'inv_18_ac_power_inv_149668', 'inv_19_dc_current_inv_149669', 'inv_19_dc_voltage_inv_149670', 'inv_19_ac_current_inv_149671', 'inv_19_ac_voltage_inv_149672', 'inv_19_ac_power_inv_149673', 'inv_20_dc_current_inv_149674', 'inv_20_dc_voltage_inv_149675', 'inv_20_ac_current_inv_149676', 'inv_20_ac_voltage_inv_149677', 'inv_20_ac_power_inv_149678', 'inv_21_dc_current_inv_149679', 'inv_21_dc_voltage_inv_149680', 'inv_21_ac_current_inv_149681', 'inv_21_ac_voltage_inv_149682', 'inv_21_ac_power_inv_149683', 'inv_22_dc_current_inv_149684', 'inv_22_dc_voltage_inv_149685', 'inv_22_ac_current_inv_149686', 'inv_22_ac_voltage_inv_149687', 'inv_22_ac_power_inv_149688', 'inv_23_dc_current_inv_149689', 'inv_23_dc_voltage_inv_149690', 'inv_23_ac_current_inv_149691', 'inv_23_ac_voltage_inv_149692', 'inv_23_ac_power_inv_149693', 'inv_24_dc_current_inv_149694', 'inv_24_dc_voltage_inv_149695', 'inv_24_ac_current_inv_149696', 'inv_24_ac_voltage_inv_149697', 'inv_24_ac_power_inv_149698']

Categorical columns (0):
[]

Structure analysis for: Environment Data

Date/time columns identified: ['measured_on']

Time range:
Minimum date: 2017-12-01 00:00:00
Maximum date: 2023-10-31 23:45:00
Total duration: 2160 days 23:45:00

Numeric columns (1):
['wind_direction_o_149577']

Categorical columns (2):
['ambient_temperature_o_149575', 'wind_speed_o_149576']

Structure analysis for: Irradiance Data

Date/time columns identified: ['measured_on']

Time range:
Minimum date: 2017-11-01 07:10:00
Maximum date: 2023-11-01 23:55:00
Total duration: 2191 days 16:45:00

Numeric columns (1):
['poa_irradiance_o_149574']

Categorical columns (0):
[]
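The first rows printed earlier suggest different sampling steps: 5 minutes for the electrical and irradiance data and 15 minutes for the environment data. The irradiance head also jumps from 07:25 to 09:00. Before the three sources are aligned for forecasting, it may help to list such gaps explicitly. A minimal sketch, assuming the timestamp column is `measured_on`; the helper `summarize_gaps` and its `expected` parameter are hypothetical and not part of `utils`:

```python
import pandas as pd


def summarize_gaps(df: pd.DataFrame, expected: str, time_col: str = "measured_on") -> pd.DataFrame:
    """List the intervals between consecutive timestamps that exceed `expected`."""
    ts = pd.to_datetime(df[time_col]).sort_values().reset_index(drop=True)
    deltas = ts.diff()  # first entry is NaT and never counts as a gap
    too_long = deltas > pd.Timedelta(expected)
    return pd.DataFrame({
        "gap_start": ts.shift(1)[too_long],
        "gap_end": ts[too_long],
        "gap_length": deltas[too_long],
    }).reset_index(drop=True)


# Timestamps copied from the first irradiance rows shown earlier.
sample = pd.DataFrame({"measured_on": [
    "2017-11-01 07:10:00", "2017-11-01 07:15:00", "2017-11-01 07:20:00",
    "2017-11-01 07:25:00", "2017-11-01 09:00:00",
]})
gaps = summarize_gaps(sample, expected="5min")
print(gaps)
```

Running the same helper with `expected="15min"` on the environment data would give a comparable list, and the resulting tables can guide the choice of a common resampling frequency.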
In [4]:
from utils import check_missing_duplicates

# 1.3 Identify missing values and duplicates
check_missing_duplicates(electrical_data, "Electrical Data")
check_missing_duplicates(environment_data, "Environment Data")
check_missing_duplicates(irradiance_data, "Irradiance Data")
==================================================
Missing values and duplicates analysis: Electrical Data
==================================================

Missing values per column:
                              Missing values  Percentage (%)
inv_13_ac_voltage_inv_149642            1728        0.273006
inv_13_ac_power_inv_149643              1728        0.273006
inv_19_dc_current_inv_149669            1728        0.273006
inv_18_ac_power_inv_149668              1728        0.273006
inv_18_ac_voltage_inv_149667            1728        0.273006
...                                      ...             ...
inv_02_dc_voltage_inv_149585             366        0.057824
inv_02_ac_current_inv_149586             366        0.057824
inv_02_ac_voltage_inv_149587             366        0.057824
inv_02_ac_power_inv_149588               366        0.057824
inv_01_dc_current_inv_149579             366        0.057824

114 rows × 2 columns

Number of duplicate rows: 0

Zeros and infinities in numeric columns:
                              Zero values  Infinite values
inv_01_dc_current_inv_149579       332145                0
inv_01_dc_voltage_inv_149580       329923                0
inv_01_ac_current_inv_149581       335279                0
inv_01_ac_voltage_inv_149582       329929                0
inv_01_ac_power_inv_149583         348575                0
...                                   ...              ...
inv_24_dc_current_inv_149694       322325                0
inv_24_dc_voltage_inv_149695       320692                0
inv_24_ac_current_inv_149696       333555                0
inv_24_ac_voltage_inv_149697       320695                0
inv_24_ac_power_inv_149698         353250                0

119 rows × 2 columns

==================================================
Missing values and duplicates analysis: Environment Data
==================================================

Missing values per column:
                              Missing values  Percentage (%)
ambient_temperature_o_149575             132        0.064075
wind_speed_o_149576                       16        0.007767
wind_direction_o_149577                    8        0.003883
Number of duplicate rows: 0

Zeros and infinities in numeric columns:
                         Zero values  Infinite values
wind_direction_o_149577          312                0
==================================================
Análisis de valores faltantes y duplicados: Irradiance Data
==================================================

Valores faltantes por columna:
Valores faltantes Porcentaje (%)
poa_irradiance_o_149574 14435 2.718358
Number of duplicate rows: 0

Zeros and infinities in numeric columns:
Zero values Infinite values
poa_irradiance_o_149574 224014 0
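
The source of `check_missing_duplicates` is not shown in this notebook, so the following is only a minimal sketch of how a report like the one above could be produced with pandas. The function name `summarize_quality`, the returned dictionary keys, and the demo frame are assumptions made for illustration, not the project's actual helper or data.

```python
import numpy as np
import pandas as pd


def summarize_quality(df: pd.DataFrame) -> dict:
    """Hypothetical stand-in for a helper like check_missing_duplicates."""
    # Missing values per column, plus their share of the total row count
    missing = df.isna().sum()
    missing_table = pd.DataFrame({
        "missing": missing,
        "percent": 100 * missing / len(df),
    })
    # Keep only columns that actually have gaps, largest first
    missing_table = missing_table[missing_table["missing"] > 0]
    missing_table = missing_table.sort_values("missing", ascending=False)

    # Zeros and infinities are only counted for numeric columns
    numeric = df.select_dtypes(include=np.number)
    zeros_inf = pd.DataFrame({
        "zeros": (numeric == 0).sum(),
        "infinite": np.isinf(numeric).sum(),
    })

    return {
        "missing": missing_table,
        "duplicate_rows": int(df.duplicated().sum()),
        "zeros_inf": zeros_inf,
    }


# Tiny made-up frame (not the project's data)
demo = pd.DataFrame({
    "measured_on": ["t0", "t1", "t2", "t2"],
    "power": [0.0, np.nan, 1.5, 1.5],
    "voltage": [0.0, np.inf, 2.0, 2.0],
})
report = summarize_quality(demo)
print(report["missing"])
print(report["duplicate_rows"])
print(report["zeros_inf"])
```

The percentage is computed against the number of rows, which is consistent with the figures above (1728 missing values out of 632952 rows is about 0.273 %). The large zero counts in the electrical columns are reported separately from missing values, so they are not treated as gaps by a check of this kind.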

2.- Descriptive Analysis

In [5]:
from utils import descriptive_stats

# 2.1 Descriptive statistics for numeric variables
descriptive_stats(electrical_data, "Electrical Data")
descriptive_stats(environment_data, "Environment Data")
descriptive_stats(irradiance_data, "Irradiance Data")
==================================================
Detailed descriptive statistics: Electrical Data
==================================================
count mean std min 1% 5% 25% 50% 75% 95% 99% max IQR Coef. of Variation Skewness Kurtosis
inv_01_dc_current_inv_149579 632586.0 9.856498 15.343311 0.0 0.0 0.0 0.0 0.0 16.02900 43.543 48.148 52.348 16.02900 1.556670 1.327263 0.208461
inv_01_dc_voltage_inv_149580 632586.0 319.205209 337.317409 0.0 0.0 0.0 0.0 0.0 675.26100 729.021 755.558 909.840 675.26100 1.056742 0.140434 -1.935645
inv_01_ac_current_inv_149581 632586.0 7.454513 11.367359 0.0 0.0 0.0 0.0 0.0 12.69600 32.467 34.189 36.363 12.69600 1.524896 1.271537 0.030629
inv_01_ac_voltage_inv_149582 632586.0 136.362110 143.161743 0.0 0.0 0.0 0.0 0.0 286.21300 294.232 297.480 310.677 286.21300 1.049865 0.099705 -1.986636
inv_01_ac_power_inv_149583 632586.0 6.363338 9.933949 0.0 0.0 0.0 0.0 0.0 10.76300 28.332 29.939 30.096 10.76300 1.561122 1.293413 0.088258
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
inv_24_dc_current_inv_149694 631224.0 9.691090 15.295462 0.0 0.0 0.0 0.0 0.0 15.54425 43.149 47.634 55.051 15.54425 1.578302 1.335543 0.210588
inv_24_dc_voltage_inv_149695 631224.0 327.085068 338.176004 0.0 0.0 0.0 0.0 0.0 679.23600 738.023 810.203 912.795 679.23600 1.033908 0.113733 -1.914641
inv_24_ac_current_inv_149696 631224.0 7.449268 11.389543 0.0 0.0 0.0 0.0 0.0 12.51800 32.567 34.173 36.066 12.51800 1.528948 1.280345 0.046233
inv_24_ac_voltage_inv_149697 631224.0 139.982626 143.089715 0.0 0.0 0.0 0.0 0.0 286.01600 293.530 296.925 310.605 286.01600 1.022196 0.046172 -1.994632
inv_24_ac_power_inv_149698 631224.0 6.382338 10.000710 0.0 0.0 0.0 0.0 0.0 10.71800 28.543 29.957 30.089 10.71800 1.566935 1.297150 0.088821

119 rows × 16 columns

==================================================
Detailed descriptive statistics: Environment Data
==================================================
count mean std min 1% 5% 25% 50% 75% 95% 99% max IQR Coef. of Variation Skewness Kurtosis
wind_direction_o_149577 206000.0 187.551757 98.382701 0.0 4.0 23.0 123.0 162.0 270.0 346.0 357.0 360.0 147.0 0.524563 0.131442 -0.969694
==================================================
Detailed descriptive statistics: Irradiance Data
==================================================
count mean std min 1% 5% 25% 50% 75% 95% 99% max IQR Coef. of Variation Skewness Kurtosis
poa_irradiance_o_149574 516584.0 255.862654 341.748415 0.0 0.0 0.0 0.0 27.6 503.5 950.7 1027.7 1400.0 503.5 1.335671 1.026265 -0.463018
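
The 16 columns in the tables above can be reproduced with standard pandas calls. The sketch below is an assumption about how such a table might be built, since `descriptive_stats` itself is not shown. The name `extended_describe`, the output column names, and the demo data are hypothetical.

```python
import numpy as np
import pandas as pd


def extended_describe(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for a helper like descriptive_stats."""
    numeric = df.select_dtypes(include=np.number)
    # describe() yields count, mean, std, min, the requested percentiles, max
    stats = numeric.describe(
        percentiles=[0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99]
    ).T
    # Interquartile range and coefficient of variation (std / mean)
    stats["IQR"] = stats["75%"] - stats["25%"]
    stats["coef_variation"] = stats["std"] / stats["mean"]
    # pandas reports sample skewness and excess (Fisher) kurtosis
    stats["skewness"] = numeric.skew()
    stats["kurtosis"] = numeric.kurt()
    return stats


# Made-up values; the non-numeric column is ignored
demo = pd.DataFrame({
    "x": [0.0, 0.0, 0.0, 10.0, 20.0],
    "label": list("abcde"),
})
stats = extended_describe(demo)
print(stats)
```

With these definitions the coefficient of variation for `inv_01_dc_current_inv_149579` is 15.343311 / 9.856498, about 1.5567, which matches the table. A column whose mean is zero would give an infinite or undefined coefficient, so that case may need separate handling.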
In [6]:
# 2.2 Visualize continuous variables (histograms)

from utils import plot_histograms

plot_histograms(electrical_data, "Electrical Data")
plot_histograms(environment_data, "Environment Data")
plot_histograms(irradiance_data, "Irradiance Data")
[Figures: histograms for Electrical Data, Environment Data, and Irradiance Data (images not included)]
In [7]:
from utils import plot_categorical 

# 2.3 Visualize categorical variables (bar charts)
plot_categorical(electrical_data, "Electrical Data")
plot_categorical(environment_data, "Environment Data")
plot_categorical(irradiance_data, "Irradiance Data")
[Figure: bar chart output (image not included)]
In [8]:
from utils import plot_correlations

# 2.4 Correlation analysis
plot_correlations(electrical_data, "Electrical Data")
plot_correlations(environment_data, "Environment Data")
plot_correlations(irradiance_data, "Irradiance Data")
[Figure: correlation matrix for Electrical Data (image not included)]
Strong correlations (>0.7 or <-0.7):
inv_12_ac_current_inv_149636  inv_12_ac_power_inv_149638      0.999389
inv_12_ac_power_inv_149638    inv_12_ac_current_inv_149636    0.999389
inv_06_ac_power_inv_149608    inv_06_ac_current_inv_149606    0.999371
inv_06_ac_current_inv_149606  inv_06_ac_power_inv_149608      0.999371
inv_15_ac_current_inv_149651  inv_15_ac_power_inv_149653      0.999371
                                                                ...   
inv_17_dc_current_inv_149659  inv_07_ac_current_inv_149611    0.806824
inv_20_ac_current_inv_149676  inv_22_dc_voltage_inv_149685    0.702169
inv_22_dc_voltage_inv_149685  inv_20_ac_current_inv_149676    0.702169
inv_01_ac_current_inv_149581  inv_01_ac_voltage_inv_149582    0.702125
inv_01_ac_voltage_inv_149582  inv_01_ac_current_inv_149581    0.702125
Length: 6904, dtype: float64
Not enough numeric variables to compute correlations in Environment Data
Not enough numeric variables to compute correlations in Irradiance Data
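
The listing above shows every strong pair twice, once as (a, b) and once as (b, a), which is why it has 6904 entries. As a hedged sketch, one way to list each pair only once is to keep just the lower triangle of the correlation matrix before filtering. The function `strong_pairs` and the demo columns are hypothetical and are not taken from `plot_correlations`.

```python
import numpy as np
import pandas as pd


def strong_pairs(df: pd.DataFrame, threshold: float = 0.7) -> pd.Series:
    """List each strongly correlated column pair once (illustrative only)."""
    corr = df.select_dtypes(include=np.number).corr()
    # Keep only entries strictly below the diagonal: no self-pairs, no mirrors
    mask = np.tril(np.ones(corr.shape, dtype=bool), k=-1)
    pairs = corr.where(mask).stack().dropna()
    strong = pairs[pairs.abs() > threshold]
    # Strongest relationships first, regardless of sign
    return strong.sort_values(key=np.abs, ascending=False)


# Made-up columns: "power" tracks "current" closely, "flat" does not
demo = pd.DataFrame({
    "current": [1.0, 2.0, 3.0, 4.0, 5.0],
    "power": [2.1, 3.9, 6.2, 8.0, 9.9],
    "flat": [3.0, 1.0, 4.0, 1.0, 5.0],
})
result = strong_pairs(demo)
print(result)
```

A frame with fewer than two numeric columns yields an empty result here, which corresponds to the "not enough numeric variables" messages printed for the environment and irradiance data.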

Temporal Analysis

In [9]:
from utils import analyze_temporal_patterns

# 3.1 Identify temporal patterns
electrical_data_temp = analyze_temporal_patterns(electrical_data, "Electrical Data")
environment_data_temp = analyze_temporal_patterns(environment_data, "Environment Data")
irradiance_data_temp = analyze_temporal_patterns(irradiance_data, "Irradiance Data")
==================================================
Temporal analysis: Electrical Data
==================================================
[Figure: image not included]
Temporal decomposition for inv_01_dc_current_inv_149579:
[Figure: image not included]
Temporal decomposition for inv_01_dc_voltage_inv_149580:
[Figure: image not included]
[Four additional figures: images not included]
==================================================
Temporal analysis: Environment Data
==================================================
[Figure: image not included]
Temporal decomposition for wind_direction_o_149577:
[Figure: image not included]
[Two additional figures: images not included]
==================================================
Temporal analysis: Irradiance Data
==================================================
[Figure: image not included]
Temporal decomposition for poa_irradiance_o_149574:
[Figure: image not included]
[Two additional figures: images not included]
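
The notebook does not show how `analyze_temporal_patterns` computes its decompositions. To illustrate the general idea only, the sketch below uses a simple additive split built from pandas operations: a centered rolling mean as the trend and the average detrended value at each position in the cycle as the seasonal component. This is a swapped-in textbook approach, not necessarily the method used in `utils`. The function name and the synthetic series are assumptions.

```python
import numpy as np
import pandas as pd


def simple_daily_decomposition(series: pd.Series, period: int) -> pd.DataFrame:
    """Additive split into trend + seasonal + residual (illustrative only)."""
    # Trend: centered rolling mean over one full period
    trend = series.rolling(window=period, center=True, min_periods=period).mean()
    detrended = series - trend
    # Seasonal: mean detrended value at each position within the period
    position = np.arange(len(series)) % period
    seasonal = detrended.groupby(position).transform("mean")
    resid = series - trend - seasonal
    return pd.DataFrame({
        "observed": series,
        "trend": trend,
        "seasonal": seasonal,
        "resid": resid,
    })


# Synthetic hourly signal: constant level 5 plus a clean 24-hour cycle
hours = np.arange(240)
index = pd.date_range("2024-01-01", periods=240, freq="60min")
demo_series = pd.Series(5 + np.sin(2 * np.pi * hours / 24), index=index)
parts = simple_daily_decomposition(demo_series, period=24)
print(parts.dropna().head())
```

For data sampled every 300 seconds, one day would correspond to `period=288`. The trend is undefined near both ends of the series because the centered window is incomplete there, so those rows hold NaN.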
In [10]:
from utils import prepare_temporal_data

# 3.2 Prepare the data for later analysis

electrical_data_prep = prepare_temporal_data(electrical_data_temp, "Electrical Data")
environment_data_prep = prepare_temporal_data(environment_data_temp, "Environment Data")
irradiance_data_prep = prepare_temporal_data(irradiance_data_temp, "Irradiance Data")
==================================================
Temporal data preparation: Electrical Data
==================================================

Filling in missing values...

Normalizing temporal frequency...
Inferred frequency: None
Using computed frequency: 300.0S
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:306: FutureWarning: 'S' is deprecated and will be removed in a future version, please use 's' instead.
  df_resampled = df_temp[numeric_cols].resample(freq).mean()
Extracting basic temporal features...
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
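
Both warnings above point to concrete changes: the `FutureWarning` asks for the lowercase `s` frequency alias, and the `PerformanceWarning` suggests building the derived columns together and joining them once with `pd.concat(axis=1)`. The sketch below shows one possible shape for that, assuming a `measured_on` timestamp column as in the loading cell. The function `resample_with_features` and the demo frame are hypothetical and are not the code in `utils.py`.

```python
import numpy as np
import pandas as pd


def resample_with_features(df: pd.DataFrame, time_col: str = "measured_on") -> pd.DataFrame:
    """Resample to a regular grid and add diff / pct_change columns at once."""
    timestamps = pd.to_datetime(df[time_col])
    indexed = df.set_index(timestamps).drop(columns=[time_col]).sort_index()
    numeric = indexed.select_dtypes(include=np.number)

    # pd.infer_freq needs at least 3 timestamps and returns None on irregular data
    freq = pd.infer_freq(indexed.index) if len(indexed) >= 3 else None
    if freq is None:
        # Fall back to the median spacing, written with the lowercase 's' alias
        step = indexed.index.to_series().diff().median()
        freq = f"{int(step.total_seconds())}s"

    resampled = numeric.resample(freq).mean()

    # Compute every derived column first, then join once (no column-by-column inserts)
    diffs = resampled.diff().add_suffix("_diff")
    pct = resampled.pct_change(fill_method=None).add_suffix("_pct_change")
    return pd.concat([resampled, diffs, pct], axis=1)


# Made-up readings with one missing 5-minute slot (10:15)
demo = pd.DataFrame({
    "measured_on": ["2024-01-01 10:00", "2024-01-01 10:05",
                    "2024-01-01 10:10", "2024-01-01 10:20"],
    "power": [1.0, 2.0, 4.0, 8.0],
})
out = resample_with_features(demo)
print(out)
```

In the demo the gap at 10:15 makes `pd.infer_freq` return `None`, so the median spacing of 300 seconds is used, mirroring the "Inferred frequency: None" message above. The fallback truncates to whole seconds, so sub-second sampling would need a different unit.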
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:319: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_pct_change'] = df_resampled[col].pct_change(fill_method=None)
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:318: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df_resampled[f'{col}_diff'] = df_resampled[col].diff()
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:306: FutureWarning: 'S' is deprecated and will be removed in a future version, please use 's' instead.
  df_resampled = df_temp[numeric_cols].resample(freq).mean()
Datos preparados con:
- Frecuencia normalizada: 300.0S
- Valores faltantes completados
- Características temporales añadidas
- Variables de diferencia calculadas
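
The `PerformanceWarning` above comes from adding the `_diff` and `_pct_change` columns one at a time at `utils.py` lines 318-319. The source of `utils.py` is not shown here, so the following is only a minimal sketch of the fix the warning itself suggests: build all derived columns first and join them with a single `pd.concat(axis=1)`. The function name `add_change_features` and the demo column `power` are hypothetical.

```python
import pandas as pd


def add_change_features(df_resampled: pd.DataFrame) -> pd.DataFrame:
    """Append <col>_diff and <col>_pct_change for every numeric column.

    All derived columns are built as two whole DataFrames and joined in a
    single concat, which avoids the repeated column inserts that fragment
    the frame.
    """
    numeric = df_resampled.select_dtypes(include="number")
    diffs = numeric.diff().add_suffix("_diff")
    pct = numeric.pct_change(fill_method=None).add_suffix("_pct_change")
    return pd.concat([df_resampled, diffs, pct], axis=1)


# Hypothetical demo data on a 5-minute grid.
idx = pd.date_range("2017-11-01 07:10", periods=4, freq="5min")
demo = pd.DataFrame({"power": [1.0, 2.0, 3.0, 6.0]}, index=idx)
features = add_change_features(demo)
print(features)
```

The first row of both derived columns is `NaN` because there is no previous sample to compare against, which matches the `NaN` in the first row of the `_diff` column printed later in this notebook.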

==================================================
Preparación de datos temporales: Environment Data
==================================================

Completando valores faltantes...

Normalizando frecuencia temporal...
Frecuencia inferida: None
Usando frecuencia calculada: 900.0S

Extrayendo características temporales básicas...

Datos preparados con:
- Frecuencia normalizada: 900.0S
- Valores faltantes completados
- Características temporales añadidas
- Variables de diferencia calculadas

==================================================
Preparación de datos temporales: Irradiance Data
==================================================

Completando valores faltantes...

Normalizando frecuencia temporal...
Frecuencia inferida: None
Usando frecuencia calculada: 300.0S
Extrayendo características temporales básicas...

Datos preparados con:
- Frecuencia normalizada: 300.0S
- Valores faltantes completados
- Características temporales añadidas
- Variables de diferencia calculadas
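
The log shows `Frecuencia inferida: None` followed by a calculated frequency such as `900.0S`, and pandas warns that the uppercase `'S'` alias is deprecated. A minimal sketch of one way to get that behavior without the warning is shown below, assuming the fallback is the median spacing between timestamps (the actual rule in `utils.py` is not shown). The function name `normalize_frequency` and the demo column are hypothetical.

```python
import pandas as pd


def normalize_frequency(df: pd.DataFrame) -> tuple[pd.DataFrame, str]:
    """Resample a datetime-indexed frame onto a regular grid.

    If pandas cannot infer a frequency (irregular index), fall back to the
    median gap between consecutive timestamps, expressed in whole seconds
    with the lowercase 's' alias.
    """
    freq = pd.infer_freq(df.index)
    if freq is None:
        step = df.index.to_series().diff().median()
        freq = f"{int(step.total_seconds())}s"
    return df.resample(freq).mean(), freq


# Hypothetical demo: the 07:25 sample is missing, so the index is irregular.
idx = pd.to_datetime([
    "2017-11-01 07:10", "2017-11-01 07:15", "2017-11-01 07:20",
    "2017-11-01 07:30", "2017-11-01 07:35",
])
demo = pd.DataFrame({"irradiance": [0.0, 0.0, 0.0, 1.5, 2.0]}, index=idx)
resampled, used_freq = normalize_frequency(demo)
print(used_freq)
print(resampled)
```

Resampling inserts a `NaN` row for every empty bin on the regular grid, so missing values can reappear after this step even if the input had been filled beforehand.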
In [11]:
from utils import detect_anomalies

# Electrical data: AC power of inverter 01 (Isolation Forest)
electrical_data_anomalies = detect_anomalies(electrical_data_prep, 'inv_01_ac_power_inv_149583', method='isolation_forest')

# Environment data: wind direction (Local Outlier Factor)
environment_data_anomalies = detect_anomalies(environment_data_prep, 'wind_direction_o_149577', method='lof')

# Irradiance data: plane-of-array (POA) irradiance (z-score)
print(irradiance_data_prep.head())
irradiance_data_anomalies = detect_anomalies(irradiance_data_prep, 'poa_irradiance_o_149574', method='zscore')
🔍 Método: ISOLATION_FOREST
Número de anomalías detectadas: 6327.0 (1.00%)
Parámetros usados: {'contamination': 0.01,'threshold': N/A}

🔍 Método: LOF
Número de anomalías detectadas: 28.0 (0.01%)
Parámetros usados: {'contamination': 0.01,'threshold': N/A}

                     poa_irradiance_o_149574  hour  day_of_week  day_of_month  \
measured_on                                                                     
2017-11-01 07:10:00                      0.0     7            2             1   
2017-11-01 07:15:00                      0.0     7            2             1   
2017-11-01 07:20:00                      0.0     7            2             1   
2017-11-01 07:25:00                      0.0     7            2             1   
2017-11-01 07:30:00                      NaN     7            2             1   

                     month  is_weekend  poa_irradiance_o_149574_diff  \
measured_on                                                            
2017-11-01 07:10:00     11           0                           NaN   
2017-11-01 07:15:00     11           0                           0.0   
2017-11-01 07:20:00     11           0                           0.0   
2017-11-01 07:25:00     11           0                           0.0   
2017-11-01 07:30:00     11           0                           NaN   

                     poa_irradiance_o_149574_pct_change  
measured_on                                              
2017-11-01 07:10:00                                 NaN  
2017-11-01 07:15:00                                 NaN  
2017-11-01 07:20:00                                 NaN  
2017-11-01 07:25:00                                 NaN  
2017-11-01 07:30:00                                 NaN  
🔍 Método: ZSCORE
Número de anomalías detectadas: 48.0 (0.01%)
Parámetros usados: {'contamination': 0.01,'threshold': 3}
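
The z-score run above reports `'threshold': 3`. The internals of `detect_anomalies` are not shown, so the following is only a sketch of the standard z-score rule under that threshold: a point is flagged when it lies more than three standard deviations from the column mean. The function name `zscore_anomalies` and the demo values are hypothetical.

```python
import numpy as np
import pandas as pd


def zscore_anomalies(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return a boolean mask that is True where |z-score| exceeds threshold.

    NaN values are ignored when computing the mean and standard deviation
    and are never flagged, because comparisons against NaN are False.
    """
    values = series.astype(float)
    std = values.std(ddof=0)
    if np.isnan(std) or std == 0:
        # Constant or empty series: nothing can be flagged.
        return pd.Series(False, index=series.index)
    z = (values - values.mean()) / std
    return z.abs() > threshold


# Hypothetical demo: twenty ordinary readings, one spike, one missing value.
demo = pd.Series([100.0] * 20 + [1000.0, np.nan])
flags = zscore_anomalies(demo, threshold=3.0)
print(int(flags.sum()), "anomalies at positions", list(flags[flags].index))
```

Because the mean and standard deviation are themselves pulled toward extreme values, this rule tends to flag very few points on long series, which is in line with the 0.01% rate reported above.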

In [12]:
from utils import prophet_anomaly_detection
# 1. No extra regressors (each model uses only its own target variable)
prophet_electrical = prophet_anomaly_detection(electrical_data_prep, 'inv_01_ac_power_inv_149583', samples=1000)
prophet_environment = prophet_anomaly_detection(environment_data_prep, 'wind_direction_o_149577', samples=1000)
prophet_irradiance = prophet_anomaly_detection(irradiance_data_prep, 'poa_irradiance_o_149574', samples=1000)
11:18:22 - cmdstanpy - INFO - Chain [1] start processing
🔧 Entrenando modelo (esto puede tomar tiempo)...
11:18:23 - cmdstanpy - INFO - Chain [1] done processing
📊 Generando predicciones...
✅ Completado. Anomalías detectadas: 3

🔍 REPORTE COMPLETO DE ANOMALÍAS
==================================================

📊 RESUMEN GENERAL
Total de puntos analizados: 1000
Anomalías detectadas: 3 (0.30%)

📈 ESTADÍSTICAS DE ANOMALÍAS
Residual máximo: -7.63
Residual mínimo: -8.34
Desviación estándar promedio: -3.14

🚨 TOP 5 ANOMALÍAS MÁS GRANDES
                      ds  y_true      yhat  residual  residual_std
104  2017-11-01 08:40:00     0.0  7.634967 -7.634967     -3.020069
105  2017-11-01 08:45:00     0.0  7.865792 -7.865792     -3.111374
107  2017-11-01 08:55:00     0.0  8.340238 -8.340238     -3.299044
⏳ DISTRIBUCIÓN TEMPORAL

Anomalías por hora del día:
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:464: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

hour
8    3
Name: count, dtype: int64
Anomalías por día de la semana:
/home/chris/Documentos/python_projects/mineria_datos/proyecto/utils.py:469: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

weekday
Wednesday    3
Name: count, dtype: int64
📉 GRÁFICO DE RESIDUALES ESTANDARIZADOS
[Figure: standardized residuals plot]
🔧 Entrenando modelo (esto puede tomar tiempo)...
📊 Generando predicciones...
✅ Completado. Anomalías detectadas: 2

🔍 REPORTE COMPLETO DE ANOMALÍAS
==================================================

📊 RESUMEN GENERAL
Total de puntos analizados: 1000
Anomalías detectadas: 2 (0.20%)

📈 ESTADÍSTICAS DE ANOMALÍAS
Residual máximo: 278.38
Residual mínimo: -271.63
Desviación estándar promedio: 0.04

🚨 TOP 5 ANOMALÍAS MÁS GRANDES
                      ds  y_true        yhat    residual  residual_std
911  2017-12-10 11:45:00   358.0   79.621006  278.378994      3.091858
495  2017-12-06 03:45:00     6.0  277.631787 -271.631787     -3.016919
⏳ DISTRIBUCIÓN TEMPORAL

Anomalías por hora del día:
hour
3     1
11    1
Name: count, dtype: int64
Anomalías por día de la semana:
weekday
Wednesday    1
Sunday       1
Name: count, dtype: int64
📉 GRÁFICO DE RESIDUALES ESTANDARIZADOS
[Figure: standardized residuals plot]
🔧 Entrenando modelo (esto puede tomar tiempo)...
📊 Generando predicciones...
✅ Completado. Anomalías detectadas: 10

🔍 REPORTE COMPLETO DE ANOMALÍAS
==================================================

📊 RESUMEN GENERAL
Total de puntos analizados: 1000
Anomalías detectadas: 10 (1.00%)

📈 ESTADÍSTICAS DE ANOMALÍAS
Residual máximo: 368.11
Residual mínimo: -401.76
Desviación estándar promedio: -0.06

🚨 TOP 5 ANOMALÍAS MÁS GRANDES
                      ds  y_true        yhat    residual  residual_std
367  2017-11-02 13:45:00  1031.8  663.687523  368.112477      3.463090
379  2017-11-02 14:45:00   848.7  495.111618  353.588382      3.326452
390  2017-11-02 15:40:00   671.8  326.969840  344.830160      3.244057
391  2017-11-02 15:45:00   651.6  312.835275  338.764725      3.186995
393  2017-11-02 15:55:00   622.7  285.490534  337.209466      3.172364
⏳ DISTRIBUCIÓN TEMPORAL

Anomalías por hora del día:
hour
11    1
13    3
14    3
15    3
Name: count, dtype: int64
Anomalías por día de la semana:
weekday
Thursday    10
Name: count, dtype: int64
📉 GRÁFICO DE RESIDUALES ESTANDARIZADOS
[Figure: standardized residuals plot]
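
In the three reports above, every flagged row has `|residual_std| > 3`, and `residual_std` is consistent with `residual` divided by a single standard deviation per series. The value printed as "Desviación estándar promedio" also matches the mean of `residual_std` over the flagged rows (for example, the mean of -3.02, -3.11 and -3.30 is -3.14), so it reads as a mean standardized residual rather than a standard deviation. The sketch below reproduces that flagging rule on synthetic forecasts, without fitting Prophet. It also takes an explicit `.copy()` of the filtered rows before adding the `hour` and `weekday` columns, which is one way to avoid the `SettingWithCopyWarning` raised at `utils.py` lines 464 and 469. The function name `flag_residual_anomalies` and the demo data are hypothetical, and the real `prophet_anomaly_detection` may differ.

```python
import numpy as np
import pandas as pd


def flag_residual_anomalies(results: pd.DataFrame, threshold: float = 3.0) -> pd.DataFrame:
    """Flag rows whose standardized forecast residual exceeds the threshold.

    Expects columns 'ds' (timestamp), 'y_true' (observed) and 'yhat'
    (forecast). Assumes residual_std = residual / std(residual).
    """
    out = results.copy()
    out["residual"] = out["y_true"] - out["yhat"]
    out["residual_std"] = out["residual"] / out["residual"].std()
    # Copy the filtered slice so the new columns are set on an independent frame.
    anomalies = out.loc[out["residual_std"].abs() > threshold].copy()
    anomalies["hour"] = anomalies["ds"].dt.hour
    anomalies["weekday"] = anomalies["ds"].dt.day_name()
    return anomalies


# Hypothetical demo: a flat forecast, small alternating errors, one large miss.
ds = pd.date_range("2017-11-02 12:00", periods=50, freq="5min")
errors = np.where(np.arange(50) % 2 == 0, 1.0, -1.0)
errors[21] = 20.0  # large positive residual at 13:45
demo = pd.DataFrame({"ds": ds, "y_true": 10.0 + errors, "yhat": 10.0})

anomalies = flag_residual_anomalies(demo)
print(anomalies[["ds", "y_true", "yhat", "residual", "residual_std"]])
print(anomalies["hour"].value_counts().sort_index())
print(anomalies["weekday"].value_counts())
```

For a circular variable such as wind direction, a plain difference like 358 - 79.6 overstates the angular distance, so residuals for that series may be worth computing on wrapped angles before standardizing.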